ABSTRACT



This report describes a 3-year project to identify and develop appropriate
assessment tools for placing adult English-as-a-Second-Language (ESL)
students into the appropriate proficiency levels according to California's
state ESL standards for adult education programs. The work involved
reviewing 18 commercially available instruments to determine their
suitability by matching content with the state standards, field testing five
potentially promising instruments, surveying agencies across the state to
determine current ESL placement practices, developing a framework for
producing assessment models, and analyzing and interpreting the testing
results. Two of the five field-tested instruments were then recommended for
use: the New York State Place Test (NYS) and the Basic English Skills Test
(BEST). In the third year, initial cutoff ranges for the two tests were
established and a test development plan to guide production of operational
placement instruments was created. The report focuses on the first of these
two tasks. The methods of field testing and data analysis are described for
both tests, and the results of field testing are presented separately. Results
are discussed and recommendations for refinement are made. Contains 6
references. (MSE) (Adjunct ERIC Clearinghouse on Literacy Education)
Introduction



The work described in this report was completed under the auspices 
of the California Adult English-as-a-Second-Language (ESL) Assessment 
Project at the UCLA Center for the Study of Evaluation, sponsored by the 
California Department of Education (CDE). The primary goal of this
three-year project was to identify and develop appropriate assessment tools
for placing adult ESL students into the appropriate proficiency levels
according to the English-as-a-Second-Language Model Standards for Adult
Education Programs 1 (California Department of Education, 1992).

The first year of project work involved reviewing 18 commercially 
available instruments to determine their suitability in terms of content 
match with the Model Standards. From the 18 reviewed, five potentially 
promising instruments were identified and field tested to determine the 
range of each instrument vis-a-vis the Model Standards proficiency levels 
and to reassess the content in light of student performance on the items. 
(See Butler, Weigle, & Sato, 1993, for a detailed report of Year 1 work.) 

The second year of work included a survey of agencies across the
state to document current ESL placement practices, the development of a 
framework for producing assessment models, and analysis and 
interpretation of the field testing results from Year 1. Based on these 
analyses, two of the five instruments field tested in Year 1 were 
recommended to the CDE for use in placing students into Model Standards 
proficiency levels: the New York State Place Test (NYS Place Test) and the 
Basic English Skills Test (BEST). 2 (See Kahn, Butler, Weigle, & Sato, 1994, 
for the results of the survey of placement procedures and Weigle, Kahn, 
Butler, & Sato, 1994, for a discussion of Year 2 work.) 



1 Henceforth in this document, the English-as-a-Second-Language Model Standards for
Adult Education Programs will be referred to as the Model Standards. There are seven 
proficiency levels designated in the Model Standards: beginning literacy, beginning low 
(BL), beginning high (BH), intermediate low (IL), intermediate high (IH), advanced low 
(AL), and advanced high (AH). The Adult ESL Assessment Project addresses placement 
only into levels beginning low through advanced high. 

2 Based on content review and field testing results from Year 1, the NYS Place Test was 
recommended for placing students into all six Model Standards proficiency levels while 
the BEST was recommended for beginning low through intermediate high only. 



There were two primary tasks for the third year of work: The first 
involved establishing initial cutoff ranges for the NYS Place Test and the 
BEST. The second task involved the creation of a test development plan to 
guide the production of operational instruments for placing students into 
the proficiency levels defined by the Model Standards (see Kahn, Butler, 
Weigle, & Sato, 1995, for a description of the process). This report focuses 
on the first of these two tasks and is organized in the following way: First, 
the methods for field testing and data analysis are described for both tests 
together. Then the results of the field testing are presented separately for 
the NYS Place Test, the BEST Oral Interview Section, and the BEST 
Literacy Section. Finally, a discussion of the results is presented and 
recommendations for refining cutoff scores are made. 



Methods 

In February 1995, the NYS Place Test, a 27-item oral interview, and 
the BEST, a 50-item oral interview and a literacy skills section containing 49 
reading items and 19 writing items, were field tested at adult education 
agencies 3 across the state following training of test administrators from 
each agency. Tables 1 and 2 present the number of students by agency and 
proficiency level who participated in the field testing of the NYS Place Test 
and the BEST, respectively. As the tables show, the NYS Place Test was 
administered to about 10 students at each level from beginning low through 
advanced high at four agencies. The BEST was administered to 
approximately 15 students at each level from beginning low through 
intermediate high at three agencies. Note that slightly fewer people were 
administered the BEST Literacy Skills Section than the Oral Interview 
Section. 

The agencies field testing the NYS Place Test and the BEST were well
into the process of aligning their courses to the Model Standards, so it was
presumed that the course level of the students was an accurate reflection of
their language proficiency according to Model Standards levels. However,
a preliminary analysis of the field testing data revealed a wide range of
scores within each level, indicating considerable variation in student
language ability for the skills being measured by these tests. Because such
a wide range of ability is not usually expected in a single level, follow-up
information about student proficiency was collected in the form of teacher
judgments of listening/speaking for the NYS Place Test, and both
listening/speaking and reading/writing for the BEST. (See Appendix for an
example of a teacher judgment form.) Since the teacher judgments were
not collected at the same time that the testing took place, not all tested
students received judgments from their teachers. Altogether, 225 or 93% of
the students taking the NYS Place Test received teacher judgments. Of the
students taking the BEST, 154 or 85% received teacher judgments of
listening/speaking, and 141 or 81% received teacher judgments of
reading/writing.

3 Henceforth in this document, adult education agency or agencies in California will be
referred to as “agency” or “agencies.”


Table 1

Number of students in field test administration of NYS Place Test by agency by
proficiency level

                         Proficiency Level
Agency           BL    BH    IL    IH    AL    AH    Total
ABC              10    10    10    10    10    10       60
Hayward          10    11    10    12    10    10       63
Santa Clara      11     9    10    10    10    10       60
Watsonville      10    10    10    10    10    10       60
Total            41    40    40    42    40    40      243


Table 2

Number of students in field test administration of BEST by agency by
proficiency level

                              Proficiency Level
Agency            BL         BH         IL         IH         Total
LAUSD             15         15         15 (14)    15         60 (59)
Oxnard            15         15         15         15         60
San Francisco     15 (12)    15 (14)    15         16 (15)    61 (56)
Total             45 (42)    45 (44)    45 (44)    46 (45)    181 (175)

Note: Numbers in parentheses represent the number of students taking the BEST
Literacy if different from the number taking the BEST Oral Interview.



For the NYS Place Test and both sections of the BEST, tentative cutoff 
points for each proficiency level were derived using procedures outlined in 
the BEST Test Manual (Center for Applied Linguistics, 1989, p. 57). 
Students were grouped by proficiency level according to current course 
enrollment. 4 The cumulative frequency distribution of scores for each level 
was calculated, and for each score, the level at which the cumulative 
frequency was closest to 50% (the median) was chosen as the most
appropriate level for that score. In borderline cases, cutoffs were chosen to 
maximize the number of students placing into the same level as their 
current class level. 

Table 3 shows a simple example of this procedure with invented data. 
As the table indicates, scores of 10 through 12 on this fictional instrument 
would place students into Level 1, since the cumulative percentage of scores 
is closest to 50 at Level 1 among the three levels. Similarly, scores of 13 and 
14 would place students into Level 2, and scores of 15 or 16 would place 
students into Level 3. 



Table 3

Example of cutoff score decisions*

             Cumulative Percentage
Score    Level 1    Level 2    Level 3
10          35         25         10
11          40         30         20
12          50         35         25
13          60         45         35
14          65         55         40
15          75         70         50
16          80         75         60

*Example is based on invented data.
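
The selection rule illustrated in Table 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure as implemented by the project or by the BEST Test Manual: the function names are hypothetical, the data are the invented values from Table 3, and the sketch assumes that the chosen level never decreases as scores increase. It also does not model the handling of borderline cases, where cutoffs were adjusted to maximize agreement with current class level.

```python
# Cumulative percentages from Table 3 (invented data): score -> percentage per level.
CUMULATIVE_PCT = {
    10: {"Level 1": 35, "Level 2": 25, "Level 3": 10},
    11: {"Level 1": 40, "Level 2": 30, "Level 3": 20},
    12: {"Level 1": 50, "Level 2": 35, "Level 3": 25},
    13: {"Level 1": 60, "Level 2": 45, "Level 3": 35},
    14: {"Level 1": 65, "Level 2": 55, "Level 3": 40},
    15: {"Level 1": 75, "Level 2": 70, "Level 3": 50},
    16: {"Level 1": 80, "Level 2": 75, "Level 3": 60},
}


def closest_to_median(pct_by_level, target=50.0):
    """Return the level whose cumulative percentage is nearest the target (the median)."""
    return min(pct_by_level, key=lambda level: abs(pct_by_level[level] - target))


def derive_cutoffs(cumulative_pct):
    """Group consecutive scores assigned to the same level into (low, high) ranges.

    Assumes the assigned level never decreases as the score increases.
    """
    ranges = {}
    for score in sorted(cumulative_pct):
        level = closest_to_median(cumulative_pct[score])
        low, _ = ranges.get(level, (score, score))
        ranges[level] = (low, score)
    return ranges


cutoffs = derive_cutoffs(CUMULATIVE_PCT)
for level, (low, high) in cutoffs.items():
    print(f"{level}: {low}-{high}")
```

With the Table 3 values, the sketch assigns scores 10 through 12 to Level 1, 13 and 14 to Level 2, and 15 and 16 to Level 3, in agreement with the ranges described in the text.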



4 The same analysis was done using teacher judgment as the indicator of proficiency. The
results were similar to the results reported here. For this reason, only the results using
current course enrollment as the criterion are discussed in this report.

Once the tentative cutoffs were set, a crosstabulation was calculated
of students by their current enrollment versus their enrollment based on
the derived cutoffs. To control for any extreme differences in proficiency
among students at the same level, students whose teacher judgment was
two or more levels away from their current class enrollment were excluded
from this analysis. The crosstabulation allows for a visual inspection of the
number of students who would be placed higher or lower than their current
class based on their test scores. Finally, percentages of students placing at,
below, or above their current level based on the test in question were
calculated as a way of summarizing the crosstabulation data succinctly.
These analyses are presented in the Results section for each test below.
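
The exclusion, crosstabulation, and summary steps described above can be sketched as follows. This is an illustrative sketch only: the student records are invented, levels are represented by the numbers 1 to 3, the cutoffs are those of the invented Table 3 example, and all function names are hypothetical rather than taken from the project's analysis.

```python
from bisect import bisect_right
from collections import Counter

# Lowest score of each level, following the invented Table 3 example
# (Level 1: 10-12, Level 2: 13-14, Level 3: 15-16).
LOWER_BOUNDS = [10, 13, 15]

# Invented records: (current class level, teacher-judged level, test score).
STUDENTS = [
    (1, 1, 10), (1, 1, 12), (1, 2, 13), (1, 3, 16),
    (2, 2, 11), (2, 2, 13), (2, 2, 14), (2, 3, 15),
    (3, 3, 16), (3, 3, 15), (3, 2, 14), (3, 1, 10),
]


def place(score):
    """Map a score to a level number (1-based) using the lower bounds."""
    return max(1, bisect_right(LOWER_BOUNDS, score))


def crosstab(students):
    """Count students by (class level, test placement).

    Students whose teacher judgment is two or more levels away from their
    class level are dropped, as in the analysis described in the text.
    """
    counts = Counter()
    for level, judged, score in students:
        if abs(level - judged) >= 2:
            continue
        counts[(level, place(score))] += 1
    return counts


def summarize(counts, level):
    """Percentages of one class level placing below, at, and above that level."""
    row = {placed: n for (cls, placed), n in counts.items() if cls == level}
    total = sum(row.values())
    below = sum(n for placed, n in row.items() if placed < level)
    at = row.get(level, 0)
    above = total - below - at
    return tuple(round(100 * n / total) for n in (below, at, above))


counts = crosstab(STUDENTS)
for level in (1, 2, 3):
    print(level, summarize(counts, level))
```

In this invented data set, two of the twelve students are excluded because their teacher judgment differs from their class level by two levels, so the summary percentages are based on the remaining ten.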



Results 

The results of the data analyses are presented for each test 
individually. Since the BEST Oral Interview Section and Literacy Skills 
Sections provide separate scores, these are discussed separately. 

NYS Place Test 

Descriptive statistics for the NYS Place Test, presented in Table 4, 
show that the mean scores increase with each level, although the means 
for intermediate high and advanced low are quite close to each other. The 
table also reveals that the score ranges for each level are fairly wide at all 
levels past beginning low. 



Table 4

NYS Place Test: Descriptive statistics by proficiency level

Proficiency Level      n     Mean      SD    Range
Beginning Low         41     4.88    3.88     0-17
Beginning High        40    12.33    7.56     2-28
Intermediate Low      40    23.88    8.99     6-47
Intermediate High     42    28.40    9.32    11-50
Advanced Low          40    29.50    7.55     9-45
Advanced High         40    35.38    7.55     9-45

Maximum number of points = 54



Table 5 shows the tentative score ranges for the six proficiency levels 
derived by the method described above. Table 6 shows the number of 
students at each class level who would have placed into the same 
proficiency level or a different proficiency level based on these cutoffs. As 
noted above, this analysis excludes students whose teacher judgment of 
their listening/speaking ability was two or more levels away from their 
current course level. The table shows that a large number of students at 
each level would have been placed one, two or in some cases even three 
levels away from their current level based on the tentative cutoffs. 



Table 5

NYS Place Test: Tentative cutoffs

Proficiency Level     Score Range
Beginning Low             0-4
Beginning High            5-14
Intermediate Low         15-23
Intermediate High        24-29
Advanced Low             30-35
Advanced High            36-54



Table 6

Placement of students by current class level into proficiency levels according to
NYS Place Test score*

                                   Placement
Class       BL       BH        IL        IH        AL        AH
Level     (0-4)    (5-14)   (15-23)   (24-29)   (30-35)   (36-54)   Total
BL         [25]      15         1                                      41
BH           5      [20]       10         4                            39
IL                    4       [16]       11         3         5        39
IH                    3        11        [9]       10         9        42
AL                    1         7        11       [12]        7        38
AH                              2         5         8       [22]       37
Total       30       43        47        40        33        43       236

*Excluding cases where |level - tj| >= 2, where tj is the teacher judgment.

Note: Brackets indicate the number of students placing into their current class level based on
tentative cutoffs



Table 7 summarizes this information, showing the percentage of 
students placing below, at, or above their current class level. As the table 
shows, the tentative cutoffs are most accurate at the extreme ends of the 
proficiency scale and least accurate at intermediate high and advanced 
low, with only 21% and 32% of students, respectively, being placed into their 
current level by the NYS Place Test. These results highlight the 
preliminary nature of the derived cutoffs, an issue that will be taken up in 
more detail in the Discussion section below. 



Table 7

Percentage of students placing below, at, or above class level based on
tentative cutoffs for the NYS Place Test

                                   Placement
Class Level           below level    at level    above level
Beginning Low              —            83           17
Beginning High            13            51           36
Intermediate Low          10            41           49
Intermediate High         33            21           45
Advanced Low              50            32           18
Advanced High             41            59            —



BEST Oral Interview Section 

Descriptive statistics for the BEST Oral Interview Section are
presented in Table 8. As the table shows, the mean score increases with
each level, with the greatest increase between beginning low and beginning
high. However, those two levels also show the greatest variance within
levels, as indicated by the standard deviations and score ranges. As with
the NYS Place Test, the wide variation in scores at all levels must be kept in
mind when reviewing the preliminary cutoffs discussed below.

Table 8

BEST Oral Interview: Descriptive statistics by proficiency level

Proficiency Level      n     Mean       SD    Range
Beginning Low         45    23.04    17.89     1-68
Beginning High        45    46.80    13.61     5-74
Intermediate Low      45    59.80    11.03    38-82
Intermediate High     46    63.87    11.42    31-77

Maximum number of points = 83

Table 9 presents the tentative score ranges derived as described 
above. The crosstabulation of students’ current placements with their 
placements based on these score ranges is found in Table 10 and 
summarized by percentage of students placing at, below, or above their 
current level in Table 11. As the tables show, the score range for beginning 
low encompasses 80% of students currently placed at beginning low; 
however, at-level placement is less than 50% for beginning high and 
intermediate low and just under 60% for intermediate high. 



Table 9

BEST Oral Interview: Tentative cutoffs

Proficiency Level     Score Range
Beginning Low             0-33
Beginning High           34-52
Intermediate Low         53-65
Intermediate High        66-83



Table 10

Placement of students by current class level into proficiency levels according
to BEST Oral Interview score*

                                  Placement
                        BL        BH        IL        IH
Class Level           (0-33)   (34-52)   (53-65)   (66-83)   Total
Beginning Low          [32]        6         2                  40
Beginning High           7       [22]       14         2        45
Intermediate Low                  11       [20]       14        45
Intermediate High        1         7        11       [27]       46
Total                   40        46        47        43       176

*Excluding cases where |level - tj| >= 2, where tj is the teacher judgment.

Note: Brackets indicate the number of students placing into their current class
level based on tentative cutoffs



Table 11

Percentage of students placing below, at, or above class level based on
tentative cutoffs for the BEST Oral Interview

                                   Placement
Class Level           below level    at level    above level
Beginning Low              —            80           20
Beginning High            16            49           35
Intermediate Low          24            44           31
Intermediate High         41            59            —
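
The percentages in Table 11 follow from the row counts in Table 10. The sketch below shows the arithmetic; it is an illustration only, with hypothetical names, and empty cells in Table 10 are entered as zero. Because the sketch reports unrounded shares, individual values may differ by about one percentage point from those in Table 11, depending on how the published figures were rounded.

```python
LEVELS = ["BL", "BH", "IL", "IH"]

# Counts from Table 10: rows are current class levels, columns are placements
# in the order BL, BH, IL, IH. Empty cells in the table are entered as 0.
TABLE_10 = {
    "BL": [32, 6, 2, 0],
    "BH": [7, 22, 14, 2],
    "IL": [0, 11, 20, 14],
    "IH": [1, 7, 11, 27],
}


def placement_shares(table, levels):
    """Return {level: (below, at, above)} as percentages of each row total."""
    shares = {}
    for i, level in enumerate(levels):
        row = table[level]
        total = sum(row)
        below, at, above = sum(row[:i]), row[i], sum(row[i + 1:])
        shares[level] = tuple(100 * n / total for n in (below, at, above))
    return shares


shares = placement_shares(TABLE_10, LEVELS)
for level, (below, at, above) in shares.items():
    print(f"{level}: below {below:.1f}%  at {at:.1f}%  above {above:.1f}%")
```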



BEST Literacy Skills Section 

Descriptive statistics for the BEST Literacy Skills Section are 
presented in Table 12. The table shows a large difference in means between 
beginning low and beginning high, with smaller differences between the 
other levels. Like the Oral Interview Section, the Literacy Skills Section 
scores vary considerably within levels, particularly at beginning low, 
casting doubt on the accuracy of any cutoff scores derived from this data
set. 

Table 12

BEST Literacy: Descriptive statistics by proficiency level

Proficiency Level      n     Mean       SD    Range
Beginning Low         42    26.17    21.41     0-54
Beginning High        44    51.73    11.50     0-65
Intermediate Low      44    57.11     9.24    24-72
Intermediate High     45    63.67     6.34    46-72

Maximum number of points = 83



Table 13 presents the cutoff score ranges as determined by the
method described above. The extreme variability and wide range of scores
at beginning low make the tentative cutoffs quite problematic: note that
beginning low encompasses more than half of the total possible score range
(maximum = 83), with much narrower ranges from beginning high
through intermediate high. The effects of these narrow ranges can be seen
in Tables 14 and 15, which show that only about one-third of students at
beginning high and intermediate low would be placed into their current
levels based on these derived cutoffs. Thus the cutoffs for the BEST Literacy
Skills Section based on this data set are problematic and should not be
implemented without additional data.



Table 13

BEST Literacy: Tentative cutoffs

Proficiency Level     Score Range
Beginning Low             0-45
Beginning High           46-53
Intermediate Low         54-60
Intermediate High        61-83



Table 14

Placement of students by current class level into proficiency levels according
to BEST Literacy score*

                                  Placement
                        BL        BH        IL        IH
Class Level           (0-45)   (46-53)   (54-60)   (61-83)   Total
Beginning Low          [30]        8                            38
Beginning High           8       [14]       15         7        44
Intermediate Low         2         9       [14]       18        43
Intermediate High                  5         7       [33]       45
Total                   40        36        36        58       170

*Excluding cases where |level - tj| >= 2, where tj is the teacher judgment.

Note: Brackets indicate the number of students placing into their current class
level based on tentative cutoffs



Table 15

Percentage of students placing below, at, or above class level based on
tentative cutoffs for the BEST Literacy

                                   Placement
Class Level           below level    at level    above level
Beginning Low              —            79           21
Beginning High            18            32           50
Intermediate Low          25            33           42
Intermediate High         27            73            —







Discussion 



In any situation where test scores are used to make decisions about 
individual students, whether it be placement, progress, or final 
achievement, the process of setting cutoff scores is an ongoing one that 
cannot be accomplished in a single field testing effort. The process involves 
a consideration of the test content and the characteristics of the students as 
well as the quantitative results of the field testing itself. Even when field 
testing provides enough information about the performance of students at 
various proficiency levels to be confident about cutoff scores, the cutoffs 
must be monitored closely in subsequent test administrations to ensure that 
the decisions made on the basis of the cutoffs are valid and appropriate. 

The complexities of setting cutoff scores are increased when the test 
in question is to be used in conjunction with an instructional program that 
is still in the process of being implemented on a large scale, as was the case 
with the Model Standards in this field testing effort. The field testing data 
from both tests revealed a great deal of score variation within levels; 
however, it is unclear to what extent this variation is due to factors related 
to the field testing itself or to actual differences in proficiency among 
students placed into the same level within and across agencies. Because 
the Model Standards had only been in place at the participating agencies for 
a short time, and especially given that appropriate placement procedures 
for use with the Model Standards were still in the process of being identified 
and developed, it is quite likely that students at the levels tested were less 
homogeneous than would be desirable for setting accurate cutoffs. Indeed, 
the variability in performance within each level revealed by this study could 
provide useful diagnostic information for agencies seeking to compare their 
implementation of the Model Standards with other agencies. 

Apart from the difficulties inherent in setting cutoffs for a program 
that has only recently been implemented, there are several other reasons 
for interpreting the field testing data and the derived cutoffs with caution. 
First, the number of students tested at each level per agency was small (10 
to 15), so that the students tested may not be representative of the level. 
Second, students across agencies were placed into class levels through the 
use of a variety of instruments assessing different skills. This may help 
explain the wide range of scores within levels since the skills upon which
placement decisions were made may not have been the skills assessed in
this field testing effort. Finally, because the field testing took place some 
weeks after the beginning of the school term, it is likely that students had 
increased their proficiency to varying degrees from the time placement 
decisions were made. Thus it is difficult to say with any degree of certainty 
that the tentative cutoffs presented in this report would place students 
appropriately into Model Standards levels. 5 



Recommendations 

The field testing effort in February 1995 was a useful first step in the 
process of determining appropriate cutoff scores for the NYS Place Test and 
the BEST for placing students into Model Standards levels. As the results 
in this report indicate, placement decisions based on the cutoff scores 
presented here are not likely to be reliable vis-a-vis Model Standards levels. 
For this reason, general dissemination of the initial cutoff ranges is not 
recommended at this time. Instead, further tryouts of the tests are 
suggested to help verify the cutoff ranges. An arrangement could be made 
to work with a small number of agencies for a one-year period to allow for 
close monitoring of the cutoff scores. To this end, agencies could be asked to 
volunteer to work with one of the two instruments and associated cutoff 
ranges as part of their placement process to help CDE make adjustments in 
the cutoff ranges as needed. Following this effort, CDE would be able to 
disseminate cutoff ranges to agencies statewide. 

Once agencies begin using the tests on a large scale, a process should 
be put in place to continue monitoring the effectiveness of placement 
decisions based on the score ranges recommended for the NYS Place Test 
or the BEST. This can be accomplished by keeping records of students
whose placements need to be changed once they are in the classroom
and by monitoring the performance of students whose scores put
them on the border between two levels. Such monitoring can reveal
whether the cutoff scores at specific levels are set appropriately or need to be 
revised. 
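
As one possible way to support such monitoring, students whose scores fall next to a boundary between two levels could be flagged automatically. The sketch below uses the tentative NYS Place Test cutoffs from Table 5; the margin of one point, the student identifiers, and the function name are hypothetical choices made for illustration, since the report does not specify how borderline scores should be defined.

```python
# Lowest score of each NYS Place Test level, from the tentative cutoffs in Table 5.
LOWER_BOUNDS = {"BL": 0, "BH": 5, "IL": 15, "IH": 24, "AL": 30, "AH": 36}


def is_borderline(score, margin=1):
    """Return True if the score lies within `margin` points of a level boundary.

    The margin is a hypothetical choice for illustration. With margin=1, the
    highest score of one level and the lowest score of the next are flagged.
    """
    boundaries = [low for low in LOWER_BOUNDS.values() if low > 0]
    return any(low - margin <= score < low + margin for low in boundaries)


# Invented scores for four hypothetical students.
scores = {"student_a": 4, "student_b": 10, "student_c": 24, "student_d": 33}
flagged = sorted(name for name, score in scores.items() if is_borderline(score))
print(flagged)
```

Records of flagged students, together with records of students whose placements were later changed, would give an agency a concrete basis for judging whether a particular cutoff needs to be moved.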



5 However, it should be noted that the published cutoff scores for the BEST (reference) suffer 
from similar problems, in that the data used to set the cutoff scores showed similar 
variability at the different levels of proficiency. 
Appendix



Example Teacher Judgment Form 
from February 1995 Field Testing 



Name: 



Level: 



As part of the California ESL Assessment Project, we are attempting to establish 
preliminary cutoff scores for the BEST, which some of your students took in February, 
1995. In order for us to be able to set accurate cutoff scores for the test, we would like some 
additional information about your students who took the test. 

The attached table includes Model Standards descriptions of the listening/speaking 
abilities and reading/writing abilities of students at the six levels from Beginning Low 
through Advanced High. Please read the description for your level and answer the 
following questions. 

1. Does the Listening/Speaking description in general 

fit the majority of the students in your class? 

If not, which Listening/Speaking description fits 

the majority of your students? 



2. Does the Reading/Writing description in general 

fit the majority of the students in your class? 

If not, which Reading/Writing description fits 

the majority of your students? 

3. Below you will see the names of the students in your class who took the BEST. In the 
space next to each name, please indicate whether the description for your level fits each 
student. If not, indicate the description that best fits the student and comment if 
appropriate. 





                  Listening/Speaking            Reading/Writing
Name              Fits     If no, which         Fits     If no, which
                  (y/n)    fits best?           (y/n)    fits best?



Thank you for your help. We appreciate your time and effort! 



Model Standards Descriptions: Speaking/Listening



A student at this level:

Beginning Low (BL)
    Can comprehend isolated words and phrases.
    Depends on gestures, a few English words, and primary language to
      communicate.

Beginning High (BH)
    Can comprehend a range of high-frequency words used in context.
    Communicates survival needs using learned phrases and sentences.

Intermediate Low (IL)
    Can comprehend conversation containing some unfamiliar words in
      familiar contexts.
    Can participate in basic conversations in routine social situations.

Intermediate High (IH)
    Can comprehend conversations containing some unfamiliar vocabulary.
    Can participate in face-to-face conversations on topics beyond
      survival needs.

Advanced Low (AL)
    Can comprehend conversation on unfamiliar topics and essential
      points of discussion in speech on topics in special fields of interest.
    Can participate in extended conversation on a variety of topics.

Advanced High (AH)
    Can comprehend abstract topics in familiar contexts and
      descriptions and narrations of factual material.
    Can participate in casual and extended conversation and in
      conversation on technical subjects with hesitancy.
    Can discuss new and unfamiliar topics with hesitancy.



Model Standards Descriptions: Reading/Writing 



A student at this level:

Beginning Low (BL)
    Can recognize letters and numbers.
    May be able to write her/his name and address.

Beginning High (BH)
    Can get limited meaning from print with successive rereading and
      checking.
    Can copy words and phrases and write sentences based on previously
      learned materials.

Intermediate Low (IL)
    Can read simplified material on familiar subjects.
    Can write short messages and notes within the scope of her/his limited
      language experience.

Intermediate High (IH)
    Can read materials on familiar subjects and authentic materials with
      limited success.
    Can perform basic writing tasks in familiar contexts.

Advanced Low (AL)
    Can read authentic materials on everyday subjects and technical
      material with difficulty.
    Can produce routine correspondence and paragraphs about previously
      discussed topics.

Advanced High (AH)
    Can read authentic materials on familiar subjects and nontechnical
      prose.
    Can produce descriptions, essays, and summaries.



Source: English-as-a-Second-Language Model Standards for Adult Education Programs,
California Department of Education, 1992. 